Extraction of ε-Cycles from Finite-State Transducers
Author
Abstract
Previous work has devoted much attention to determinization and ε-removal. This article describes an algorithm for extracting all ε-cycles, which are a special type of non-determinism, from an arbitrary finite-state transducer (FST). The algorithm factorizes (decomposes) the FST, T, into two FSTs, T1 and T2, such that T1 contains no ε-cycles and T2 contains all ε-cycles of T. Since ε-cycles are an obstacle for some algorithms, such as the factorization of ambiguous FSTs, the proposed approach allows us to bypass this problem: ε-cycles can be extracted before such algorithms are applied and re-inserted (by composition) afterwards.
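To make the notion of an ε-cycle concrete, the following is a minimal sketch that only detects the states lying on such cycles; it does not perform the factorization into T1 and T2 described in the abstract. It assumes, for illustration, that an ε-cycle is a cycle whose arcs all carry ε on the input side, that ε is encoded as the empty string, and that arcs are given as `(src, in_label, out_label, dst)` tuples. The function name and the arc encoding are hypothetical and not taken from the article.

```python
from collections import defaultdict

EPS = ""  # assumed encoding of the empty symbol ε


def states_on_input_epsilon_cycles(arcs):
    """Return the states that lie on a cycle made only of ε-input arcs.

    arcs: iterable of (src, in_label, out_label, dst) tuples (assumed format).
    """
    # Keep only the arcs whose input label is ε.
    graph = defaultdict(set)
    for src, in_label, _out_label, dst in arcs:
        if in_label == EPS:
            graph[src].add(dst)

    def reachable(start):
        # States reachable from `start` via one or more ε-input arcs.
        seen, stack = set(), list(graph[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, ()))
        return seen

    # A state is on an ε-cycle if it can reach itself.
    return {s for s in list(graph) if s in reachable(s)}


# Toy transducer: states 1 and 2 form a cycle reading no input.
arcs = [
    (0, "a", "x", 1),
    (1, EPS, "y", 2),
    (2, EPS, EPS, 1),
    (2, "b", "z", 3),
]
cycle_states = states_on_input_epsilon_cycles(arcs)
```

The per-state reachability check is quadratic in the worst case; a strongly connected components algorithm such as Tarjan's would be the usual choice for large transducers.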
Similar resources
Automatic Extraction of Hypernyms and Hyponyms from Russian Texts
The paper describes a rule-based approach for hypernym and hyponym extraction from Russian texts. For this task we employ finite state transducers (FSTs). We developed 6 finite state transducers that encode 6 lexico-syntactic patterns, which show good precision on Russian DBpedia: 79.5% of the matched contexts are correct.
Finite state complexity
In this paper we develop a version of Algorithmic Information Theory (AIT) based on finite transducers instead of Turing machines; the complexity induced is called finite-state complexity. In spite of the fact that the Universality Theorem (true for Turing machines) is false for finite transducers, the Invariance Theorem holds true for finite-state complexity. We construct a class of finite-sta...
Subject And Object Dependency Extraction Using Finite-State Transducers
We describe and evaluate an approach for fast automatic recognition and extraction of subject and object dependency relations from large French corpora, using a sequence of finite-state transducers. The extraction is performed in two major steps: incremental finite-state parsing and extraction of subject/verb and object/verb relations. Our incremental and cautious approach during the first phas...
Composition Closure of ε-Free Linear Extended Top-Down Tree Transducers
The expressive power of compositions of linear extended top-down tree transducers with and without regular look-ahead is investigated. In particular, the restrictions of ε-freeness, strictness, and nondeletion are considered. The composition hierarchy is finite for all ε-free variants of these transducers except for ε-free nondeleting linear extended top-down tree transducers. The least number o...
A Two Phase Method for Information Extraction
In biology and functional genomics in particular, understanding the dependence and interplay between different genome and ecological characteristics of organisms is a very challenging problem. There are some public databases which combine this kind of information, but there is still much more information about microbes and other organisms that reside in unstructured and semi-structured document...